A metric index for approximate string matching
Similar resources
A Metric Index for Approximate String Matching
We present a radically new indexing approach for approximate string matching. The scheme uses the metric properties of the edit distance and can be applied to any other metric between strings. We build a metric space where the sites are the nodes of the suffix tree of the text, and the approximate query is seen as a proximity query on that metric space. This permits us finding the occ occurrenc...
Fast index for approximate string matching
We present an index that stores a text of length n such that given a pattern of length m, all the substrings of the text that are within Hamming distance (or edit distance) at most k from the pattern are reported in O(m + log log n + #matches) time (for constant k). The space complexity of the index is O(n^{1+ε}) for any constant ε > 0.
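To make the reporting problem in this abstract concrete, the following is a minimal brute-force sketch of the Hamming-distance variant only. It is not the index described above: it scans every window of the text and takes O(nm) time in the worst case. The function name `hamming_matches` is hypothetical and chosen for illustration.

```python
def hamming_matches(text: str, pattern: str, k: int) -> list[int]:
    """Return every start position i such that text[i:i+m] differs from
    pattern in at most k positions (Hamming distance <= k).

    Naive baseline for illustration, not the indexed algorithm.
    """
    m = len(pattern)
    hits = []
    for i in range(len(text) - m + 1):
        mismatches = 0
        for a, b in zip(text[i:i + m], pattern):
            if a != b:
                mismatches += 1
                if mismatches > k:
                    break  # this window already exceeds the error budget
        if mismatches <= k:
            hits.append(i)
    return hits


if __name__ == "__main__":
    print(hamming_matches("abracadabra", "abra", 0))
    print(hamming_matches("aaab", "ab", 1))
```

An index of the kind the abstract describes aims to replace this linear scan with a query whose cost depends on m, k, and the number of matches rather than on the full text length.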
Cache-Oblivious Index for Approximate String Matching
This paper revisits the problem of indexing a text for approximate string matching. Specifically, given a text T of length n and a positive integer k, we want to construct an index of T such that for any input pattern P, we can find all its k-error matches in T efficiently. This problem is well-studied in the internal-memory setting. Here, we extend some of these recent results to external-mem...
Approximate String Matching Using a Bidirectional Index
We study strategies of approximate pattern matching that exploit bidirectional text indexes, extending and generalizing ideas of [6]. We introduce a formalism, called search schemes, to specify search strategies of this type, then develop a probabilistic measure for the efficiency of a search scheme, prove several combinatorial results on efficient search schemes, and finally, provide experimen...
Metric Indexes for Approximate String Matching in a Dictionary
We consider the problem of finding all approximate occurrences of a given string q, with at most k differences, in a finite database or dictionary of strings. The strings can be e.g. natural language words, such as the vocabulary of some document or set of documents. This has many important applications in both offline (indexed) and on-line string matching. More precisely, we have a universe U o...
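As a rough illustration of how metric properties can help dictionary search with at most k differences, the sketch below uses a generic pivot-based filter built on the triangle inequality of the edit distance. This is a standard metric-space technique shown under assumptions; it is not claimed to be the specific index of the paper above. The names `edit_distance` and `PivotIndex`, and the choice of pivots, are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the classic row-by-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                 # delete ca
                           cur[j - 1] + 1,              # insert cb
                           prev[j - 1] + (ca != cb)))   # substitute or match
        prev = cur
    return prev[-1]


class PivotIndex:
    """Precomputes distances from every dictionary word to a few pivots."""

    def __init__(self, words, pivots):
        self.words = list(words)
        self.pivots = list(pivots)
        self.table = [[edit_distance(w, p) for p in self.pivots]
                      for w in self.words]

    def search(self, q: str, k: int) -> list[str]:
        dq = [edit_distance(q, p) for p in self.pivots]
        out = []
        for w, row in zip(self.words, self.table):
            # Triangle inequality: d(q, w) >= |d(q, p) - d(w, p)| for any
            # pivot p, so a gap larger than k rules w out without computing
            # d(q, w).
            if any(abs(dw - d) > k for dw, d in zip(row, dq)):
                continue
            if edit_distance(q, w) <= k:
                out.append(w)
        return out


if __name__ == "__main__":
    words = ["kitten", "sitting", "mitten", "bitten", "apple"]
    index = PivotIndex(words, pivots=["kitten", "apple"])
    print(index.search("kitten", 1))
```

The filter never discards a true match, because it only skips words whose lower bound on the distance already exceeds k. How many distance computations it saves depends on the pivots chosen.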
Journal
Journal title: Theoretical Computer Science
Year: 2006
ISSN: 0304-3975
DOI: 10.1016/j.tcs.2005.11.037